<!DOCTYPE html>
<html lang="en-us">
  <head>

    <meta http-equiv="content-type" content="text/html; charset=utf-8">
    
<meta charset="UTF-8">
<title>Ignoring TF/IDF | Elasticsearch: The Definitive Guide [2.x] | Elastic</title>
<link rel="home" href="index.html" title="Elasticsearch: The Definitive Guide [2.x]">
<link rel="up" href="controlling-relevance.html" title="Controlling Relevance">
<link rel="prev" href="not-quite-not.html" title="Not Quite Not">
<link rel="next" href="function-score-query.html" title="function_score Query">
<meta name="DC.type" content="Learn/Docs/Legacy/Elasticsearch/Definitive Guide/2.x">
<meta name="DC.subject" content="Elasticsearch">
<meta name="DC.identifier" content="2.x">
<meta name="robots" content="noindex,nofollow">
    <meta http-equiv="X-UA-Compatible" content="IE=edge">
    <meta name="viewport" content="width=device-width, initial-scale=1">
    <script src="https://cdn.optimizely.com/js/18132920325.js"></script>
    <link rel="apple-touch-icon" sizes="57x57" href="/apple-icon-57x57.png">
    <link rel="apple-touch-icon" sizes="60x60" href="/apple-icon-60x60.png">
    <link rel="apple-touch-icon" sizes="72x72" href="/apple-icon-72x72.png">
    <link rel="apple-touch-icon" sizes="76x76" href="/apple-icon-76x76.png">
    <link rel="apple-touch-icon" sizes="114x114" href="/apple-icon-114x114.png">
    <link rel="apple-touch-icon" sizes="120x120" href="/apple-icon-120x120.png">
    <link rel="apple-touch-icon" sizes="144x144" href="/apple-icon-144x144.png">
    <link rel="apple-touch-icon" sizes="152x152" href="/apple-icon-152x152.png">
    <link rel="apple-touch-icon" sizes="180x180" href="/apple-icon-180x180.png">
    <link rel="icon" type="image/png" href="/favicon-32x32.png" sizes="32x32">
    <link rel="icon" type="image/png" href="/android-chrome-192x192.png" sizes="192x192">
    <link rel="icon" type="image/png" href="/favicon-96x96.png" sizes="96x96">
    <link rel="icon" type="image/png" href="/favicon-16x16.png" sizes="16x16">
    <link rel="manifest" href="/manifest.json">
    <meta name="apple-mobile-web-app-title" content="Elastic">
    <meta name="application-name" content="Elastic">
    <meta name="msapplication-TileColor" content="#ffffff">
    <meta name="msapplication-TileImage" content="/mstile-144x144.png">
    <meta name="theme-color" content="#ffffff">
    <meta name="naver-site-verification" content="936882c1853b701b3cef3721758d80535413dbfd">
    <meta name="yandex-verification" content="d8a47e95d0972434">
    <meta name="localized" content="true">
    <meta name="st:robots" content="follow,index">
    <meta property="og:image" content="https://www.elastic.co/static/images/elastic-logo-200.png">
    <link rel="shortcut icon" href="/favicon.ico" type="image/x-icon">
    <link rel="icon" href="/favicon.ico" type="image/x-icon">
    <link rel="apple-touch-icon-precomposed" sizes="64x64" href="/favicon_64x64_16bit.png">
    <link rel="apple-touch-icon-precomposed" sizes="32x32" href="/favicon_32x32.png">
    <link rel="apple-touch-icon-precomposed" sizes="16x16" href="/favicon_16x16.png">
    <!-- Give IE8 a fighting chance -->
    <!--[if lt IE 9]>
    <script src="https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js"></script>
    <script src="https://oss.maxcdn.com/respond/1.4.2/respond.min.js"></script>
    <![endif]-->
    <link rel="stylesheet" type="text/css" href="/guide/static/styles.css">
  </head>

  <!--© 2015-2021 Elasticsearch B.V. Copying, publishing and/or distributing without written permission is strictly prohibited.-->

  <body>
    <!-- Google Tag Manager -->
    <script>dataLayer = [];</script><noscript><iframe src="//www.googletagmanager.com/ns.html?id=GTM-58RLH5" height="0" width="0" style="display:none;visibility:hidden"></iframe></noscript>
    <script>(function(w,d,s,l,i){w[l]=w[l]||[];w[l].push({'gtm.start': new Date().getTime(),event:'gtm.js'});var f=d.getElementsByTagName(s)[0], j=d.createElement(s),dl=l!='dataLayer'?'&l='+l:'';j.async=true;j.src= '//www.googletagmanager.com/gtm.js?id='+i+dl;f.parentNode.insertBefore(j,f); })(window,document,'script','dataLayer','GTM-58RLH5');</script>
    <!-- End Google Tag Manager -->

    <!-- Global site tag (gtag.js) - Google Analytics -->
    <script async src="https://www.googletagmanager.com/gtag/js?id=UA-12395217-16"></script>
    <script>
      window.dataLayer = window.dataLayer || [];
      function gtag(){dataLayer.push(arguments);}
      gtag('js', new Date());
      gtag('config', 'UA-12395217-16');
    </script>

    <!--BEGIN QUALTRICS WEBSITE FEEDBACK SNIPPET-->
    <script type="text/javascript">
      (function(){var g=function(e,h,f,g){
      this.get=function(a){for(var a=a+"=",c=document.cookie.split(";"),b=0,e=c.length;b<e;b++){for(var d=c[b];" "==d.charAt(0);)d=d.substring(1,d.length);if(0==d.indexOf(a))return d.substring(a.length,d.length)}return null};
      this.set=function(a,c){var b="",b=new Date;b.setTime(b.getTime()+6048E5);b="; expires="+b.toGMTString();document.cookie=a+"="+c+b+"; path=/; "};
      this.check=function(){var a=this.get(f);if(a)a=a.split(":");else if(100!=e)"v"==h&&(e=Math.random()>=e/100?0:100),a=[h,e,0],this.set(f,a.join(":"));else return!0;var c=a[1];if(100==c)return!0;switch(a[0]){case "v":return!1;case "r":return c=a[2]%Math.floor(100/c),a[2]++,this.set(f,a.join(":")),!c}return!0};
      this.go=function(){if(this.check()){var a=document.createElement("script");a.type="text/javascript";a.src=g;document.body&&document.body.appendChild(a)}};
      this.start=function(){var a=this;window.addEventListener?window.addEventListener("load",function(){a.go()},!1):window.attachEvent&&window.attachEvent("onload",function(){a.go()})}};
      try{(new g(100,"r","QSI_S_ZN_emkP0oSe9Qrn7kF","https://znemkp0ose9qrn7kf-elastic.siteintercept.qualtrics.com/WRSiteInterceptEngine/?Q_ZID=ZN_emkP0oSe9Qrn7kF")).start()}catch(i){}})();
    </script><div id="ZN_emkP0oSe9Qrn7kF"><!--DO NOT REMOVE-CONTENTS PLACED HERE--></div>
    <!--END WEBSITE FEEDBACK SNIPPET-->

    <div id="elastic-nav" style="display:none;"></div>
    <script src="https://www.elastic.co/elastic-nav.js"></script>

    <!-- Subnav -->
    <div>
      <div>
        <div class="tertiary-nav d-none d-md-block">
          <div class="container">
            <div class="p-t-b-15 d-flex justify-content-between nav-container">
              <div class="breadcrum-wrapper"><span><a href="/guide/" style="font-size: 14px; font-weight: 600; color: #000;">Docs</a></span></div>
            </div>
          </div>
        </div>
      </div>
    </div>

    <div class="main-container">
      <section id="content">
        <div class="content-wrapper">

          <section id="guide" lang="en">
            <div class="container">
              <div class="row">
                <div class="col-xs-12 col-sm-8 col-md-8 guide-section">
                  <!-- start body -->
                  <div class="page_header">
<p>
  <strong>WARNING</strong>: The 2.x versions of Elasticsearch have passed their
  <a href="https://www.elastic.co/support/eol">EOL dates</a>. If you are running
  a 2.x version, we strongly advise you to upgrade.
</p>
<p>
  This documentation is no longer maintained and may be removed. For the latest
  information, see the <a href="https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html">current
  Elasticsearch documentation</a>.
</p>
</div>
<div id="content">
<div class="breadcrumbs">
<span class="breadcrumb-link"><a href="index.html">Elasticsearch: The Definitive Guide [2.x]</a></span>
»
<span class="breadcrumb-link"><a href="search-in-depth.html">Search in Depth</a></span>
»
<span class="breadcrumb-link"><a href="controlling-relevance.html">Controlling Relevance</a></span>
»
<span class="breadcrumb-node">Ignoring TF/IDF</span>
</div>
<div class="navheader">
<span class="prev">
<a href="not-quite-not.html">« Not Quite Not</a>
</span>
<span class="next">
<a href="function-score-query.html">function_score Query »</a>
</span>
</div>
<div class="section">
<div class="titlepage"><div><div>
<h2 class="title">
<a id="ignoring-tfidf"></a>Ignoring TF/IDF<a class="edit_me edit_me_private" rel="nofollow" title="Editing on GitHub is available to Elastic" href="https://github.com/elastic/elasticsearch-definitive-guide/edit/2.x/170_Relevance/35_Ignoring_TFIDF.asciidoc">edit</a>
</h2>
</div></div></div>
<p>Sometimes we just don’t care about TF/IDF.  All we want to know is that a certain
word appears in a field. Perhaps we are searching for a vacation home and we
want to find houses that have as many of these features as possible:</p>
<div class="ulist itemizedlist">
<ul class="itemizedlist">
<li class="listitem">
WiFi
</li>
<li class="listitem">
Garden
</li>
<li class="listitem">
Pool
</li>
</ul>
</div>
<p>The vacation home documents look something like this:</p>
<div class="pre_wrapper lang-json">
<pre class="programlisting prettyprint lang-json">{ "description": "A delightful four-bedroomed house with ... " }</pre>
</div>
<p>We could use a simple <code class="literal">match</code> query:</p>
<div class="pre_wrapper lang-json">
<pre class="programlisting prettyprint lang-json">GET /_search
{
  "query": {
    "match": {
      "description": "wifi garden pool"
    }
  }
}</pre>
</div>
<p>However, this isn’t really <em>full-text search</em>. In this case, TF/IDF just gets
in the way.  We don’t care whether <code class="literal">wifi</code> is a common term, or how often it
appears in the document.  All we care about is that it does appear.
In fact, we just want to rank houses by the number of features they have—​the more, the better. If a feature is present, it should score <code class="literal">1</code>, and if it
isn’t, <code class="literal">0</code>.</p>
<div class="section">
<div class="titlepage"><div><div>
<h3 class="title">
<a id="constant-score-query"></a>constant_score Query<a class="edit_me edit_me_private" rel="nofollow" title="Editing on GitHub is available to Elastic" href="https://github.com/elastic/elasticsearch-definitive-guide/edit/2.x/170_Relevance/35_Ignoring_TFIDF.asciidoc">edit</a>
</h3>
</div></div></div>
<p>Enter the <a href="/guide/en/elasticsearch/reference/2.4/query-dsl-constant-score-query.html" class="ulink" target="_top"><code class="literal">constant_score</code></a> query.
This query can wrap either a query or a filter, and assigns a score of
<code class="literal">1</code> to any documents that match, regardless of TF/IDF:</p>
<div class="pre_wrapper lang-json">
<pre class="programlisting prettyprint lang-json">GET /_search
{
  "query": {
    "bool": {
      "should": [
        { "constant_score": {
          "query": { "match": { "description": "wifi" }}
        }},
        { "constant_score": {
          "query": { "match": { "description": "garden" }}
        }},
        { "constant_score": {
          "query": { "match": { "description": "pool" }}
        }}
      ]
    }
  }
}</pre>
</div>
<p>Perhaps not all features are equally important—​some have more value to
the user than others. If the most important feature is the pool, we could
boost that clause to make it count for more:</p>
<div class="pre_wrapper lang-json">
<pre class="programlisting prettyprint lang-json">GET /_search
{
  "query": {
    "bool": {
      "should": [
        { "constant_score": {
          "query": { "match": { "description": "wifi" }}
        }},
        { "constant_score": {
          "query": { "match": { "description": "garden" }}
        }},
        { "constant_score": {
          "boost":   2 <a id="CO111-1"></a><i class="conum" data-value="1"></i>
          "query": { "match": { "description": "pool" }}
        }}
      ]
    }
  }
}</pre>
</div>
<div class="calloutlist">
<table border="0" summary="Callout list">
<tr>
<td align="left" valign="top" width="5%">
<p><a href="#CO111-1"><i class="conum" data-value="1"></i></a></p>
</td>
<td align="left" valign="top">
<p>A matching <code class="literal">pool</code> clause would add a score of <code class="literal">2</code>, while
the other clauses would add a score of only <code class="literal">1</code> each.</p>
</td>
</tr>
</table>
</div>
<div class="note admon">
<div class="icon"></div>
<div class="admon_content">
<p>The final score for each result is not simply the sum of the scores of
all matching clauses.  The <a class="xref" href="practical-scoring-function.html#coord" title="Query Coordination">coordination factor</a> and
<a class="xref" href="practical-scoring-function.html#query-norm" title="Query Normalization Factor">query normalization factor</a> are still taken into account.</p>
</div>
</div>
<p>We could improve our vacation home documents by adding a <code class="literal">not_analyzed</code>
<code class="literal">features</code> field to our vacation homes:</p>
<div class="pre_wrapper lang-json">
<pre class="programlisting prettyprint lang-json">{ "features": [ "wifi", "pool", "garden" ] }</pre>
</div>
<p>By default, a <code class="literal">not_analyzed</code> field has <a class="xref" href="scoring-theory.html#field-norm" title="Field-length norm">field-length norms</a>
disabled and has <code class="literal">index_options</code> set to <code class="literal">docs</code>, disabling
<a class="xref" href="scoring-theory.html#tf" title="Term frequency">term frequencies</a>, but the problem remains: the
<a class="xref" href="scoring-theory.html#idf" title="Inverse document frequency">inverse document frequency</a> of each term is still taken into account.</p>
<p>We could use the same approach that we used previously, with the <code class="literal">constant_score</code>
query:</p>
<div class="pre_wrapper lang-json">
<pre class="programlisting prettyprint lang-json">GET /_search
{
  "query": {
    "bool": {
      "should": [
        { "constant_score": {
          "query": { "match": { "features": "wifi" }}
        }},
        { "constant_score": {
          "query": { "match": { "features": "garden" }}
        }},
        { "constant_score": {
          "boost":   2
          "query": { "match": { "features": "pool" }}
        }}
      ]
    }
  }
}</pre>
</div>
<p>Really, though, each of these features should be treated like a filter.  A
vacation home either has the feature or it doesn’t—​a filter seems like it
would be a natural fit.  On top of that, if we use filters, we can
benefit from filter caching.</p>
<p>The problem is this: filters don’t score. What we need is a way of bridging
the gap between filters and queries. The <code class="literal">function_score</code> query does this
and a whole lot more.</p>
</div>

</div>
<div class="navfooter">
<span class="prev">
<a href="not-quite-not.html">« Not Quite Not</a>
</span>
<span class="next">
<a href="function-score-query.html">function_score Query »</a>
</span>
</div>
</div>

                  <!-- end body -->
                </div>
                <div class="col-xs-12 col-sm-4 col-md-4" id="right_col">
                  <div id="rtpcontainer" style="display: block;">
                    <div class="mktg-promo">
                      <h3>Most Popular</h3>
                      <ul class="icons">
                        <li class="icon-elasticsearch-white"><a href="https://www.elastic.co/webinars/getting-started-elasticsearch?baymax=default&amp;elektra=docs&amp;storm=top-video">Get Started with Elasticsearch: Video</a></li>
                        <li class="icon-kibana-white"><a href="https://www.elastic.co/webinars/getting-started-kibana?baymax=default&amp;elektra=docs&amp;storm=top-video">Intro to Kibana: Video</a></li>
                        <li class="icon-logstash-white"><a href="https://www.elastic.co/webinars/introduction-elk-stack?baymax=default&amp;elektra=docs&amp;storm=top-video">ELK for Logs &amp; Metrics: Video</a></li>
                      </ul>
                    </div>
                  </div>
                </div>
              </div>
            </div>
          </section>

        </div>


<div id="elastic-footer"></div>
<script src="https://www.elastic.co/elastic-footer.js"></script>
<!-- Footer Section end-->

      </section>
    </div>

<script src="/guide/static/jquery.js"></script>
<script type="text/javascript" src="/guide/static/docs.js"></script>
<script type="text/javascript">
  window.initial_state = {}</script>
  </body>
</html>
